Add rhos-mcps support by Akrog · Pull Request #124 · openstack-lightspeed/operator

Akrog · 2026-06-24T14:43:50Z

This PR adds support for the rhos-mcps image that provides openstack-cli and openshift-cli tools.

The MCP server is deployed as a sidecar container with the openshift-cli tool always enabled and the openstack-cli tool only enabled once the OpenStackControlPlane is ready.

By default the rhos-mcps is not deployed and requires the rhos_mcps feature flag to be enabled.

Summary by CodeRabbit

New Features
- Added support for optional MCP server integration, including new runtime configuration, related images, and deployment settings.
- Exposed new status indicators so users can see MCP server readiness and credential-secret tracking.
- Updated the sample configuration to include MCP-related options.
Bug Fixes
- Improved reconciliation to react to supporting resource changes more reliably.
- Expanded cleanup handling so MCP-related resources are removed during deletion.

Align with the version used by the openstack-k8s-operators.

openshift-ci · 2026-06-24T14:43:58Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Akrog

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details

Needs approval from an approver in each of these files:

~~OWNERS~~ [Akrog]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

coderabbitai · 2026-06-24T14:44:04Z

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 2e2a5b8a-d122-4b94-856e-6451a3f74428

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

🔍 Trigger review

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands.}

coderabbitai

Actionable comments posted: 15

🧹 Nitpick comments (3)

test/kuttl/tests/rhos-mcps-configuration/04-disable-rhos-mcps.yaml (1)
21-22: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Add an explicit post-disable assertion for sidecar removal.

This step disables the flag, but this scenario only verifies mcp-config cleanup later. Add an assertion step that lightspeed-stack-deployment no longer includes rhos-mcp to prevent toggle regressions from passing unnoticed.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/kuttl/tests/rhos-mcps-configuration/04-disable-rhos-mcps.yaml` around
lines 21 - 22, Add an explicit assertion after disabling the feature flag to
verify the sidecar is removed from the deployment. Update the KUTTL scenario
around the rhos-mcps toggle so it checks that lightspeed-stack-deployment no
longer contains rhos-mcp, in addition to the existing mcp-config cleanup
validation, to catch regressions in the disable path.
test/kuttl/common/mock-openstack/mock-keystone.yaml (1)
118-118: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Pin the mock image instead of using :latest.

A floating tag makes this KUTTL fixture nondeterministic and can break CI when the base image changes underneath the test. Please pin a specific tag or digest here so the mock environment stays reproducible.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/kuttl/common/mock-openstack/mock-keystone.yaml` at line 118, The mock
Keystone fixture is using a floating image tag, which makes the KUTTL test
environment nondeterministic. Update the image reference in the mock OpenStack
YAML to a pinned version or digest instead of
registry.redhat.io/ubi8/python-311:latest, keeping the mock setup reproducible
across CI runs.
test/kuttl/common/mock-openstack/mock-oscp-resources.yaml (1)
84-86: 🎯 Functional Correctness | 🔵 Trivial | ⚡ Quick win

Model OpenStackControlPlane readiness in the mock status.

The PR contract says MCP enablement waits for OpenStackControlPlane to become ready, but this fixture only provides status.tls.caBundleSecretName. As written, the KUTTL flow can pass without ever exercising the ready-gated path. Add a ready condition here, or split the scenario into pre-ready and post-ready assertions.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/kuttl/common/mock-openstack/mock-oscp-resources.yaml` around lines 84 -
86, The mock OpenStackControlPlane status only sets
status.tls.caBundleSecretName, so the readiness-gated path in the MCP enablement
flow is never exercised. Update the mock status fixture used by
OpenStackControlPlane to include a ready condition in addition to the TLS
bundle, using the existing OpenStackControlPlane status shape so the KUTTL
scenario can assert both pre-ready and post-ready behavior if needed.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@api/v1beta1/openstacklightspeed_types.go`:
- Around line 49-50: The MCP sidecar image is still defaulting to a mutable
“latest” tag, which means deployments can drift from the tested binary. Update
the default image constant in openstacklightspeed_types.go and ensure the same
pinned, immutable release image is used in manager.yaml and the generated CSV so
all install and upgrade paths reference the exact released MCP server image.

In `@cmd/main.go`:
- Around line 193-197: The `getDynamicWatchCRDs` call site still treats the
function as returning an error even though it never does, which triggers the
`unparam` lint issue. Update `getDynamicWatchCRDs` to remove the dead error
return, then simplify its callers in `main` and any other affected paths
(including the other mentioned block) to stop checking `err` and just use the
returned CRD value(s) directly.

In `@config/rbac/role.yaml`:
- Around line 15-26: The ClusterRole in role.yaml is granting broad cluster-wide
Secret write/read permissions that override the intended `resourceNames`
restriction for `pull-secret`. Tighten the `secrets` rule in this ClusterRole to
only the minimal cluster-scoped Secret access needed for MCP/Keystone, and move
any Secret create/update/patch/delete permissions to the namespaced Role used
for managed Secrets. Keep the existing `resourceNames`-based access for
`pull-secret` and verify the `ClusterRole` versus `Role` split preserves only
the required external reads.

In `@config/samples/api_v1beta1_openstacklightspeed.yaml`:
- Around line 18-21: The sample has a duplicate commented dev configuration
block that can lead users to create two dev keys when enabling both examples.
Update the OpenStack Lightspeed sample so the rhos_mcps option is folded into
the existing commented dev.featureFlags example rather than adding a second dev
section, using the dev and featureFlags sample entries to keep the structure
unambiguous.

In `@internal/controller/common.go`:
- Around line 292-301: GetDeploymentVolumeSection currently takes
appsv1.Deployment by value, so the returned *corev1.Volume points into a copied
Deployment rather than the caller’s object. Update GetDeploymentVolumeSection to
accept a *appsv1.Deployment (add a nil guard) and return the matching volume
from deployment.Spec.Template.Spec.Volumes so any caller mutation affects the
original Deployment.

In `@internal/controller/mcp_server.go`:
- Around line 794-820: The cleanup in mcp_server.go is deleting broadly named
ConfigMaps/Secrets without verifying they belong to this operator instance,
which can remove pre-existing or user-managed resources; update the deletion
logic around the MCP cleanup block and the related password Secret cleanup to
first check ownership via labels or controller metadata before calling
helper.GetClient().Delete. Use the existing resource symbols
MCPConfigYAMLConfigMapName, CloudsYAMLConfigMapName, SecureYAMLSecretName,
CombinedCABundleSecretName, and the lightspeed-password Secret cleanup path to
ensure only operator-managed objects are removed, especially for the
cross-namespace Secret where owner references may not be safe.
- Around line 514-516: The OpenStack MCP setup flow in mcp_server.go is treating
a Secret as ready even when AC_ID or AC_SECRET are missing, which can write
empty application-credential values into secure.yaml. Update the logic around
the return in the application credential handling path to verify both secret
keys exist and contain non-empty values before marking the secret ready or
returning success; if either key is missing, keep waiting/retrying instead of
proceeding.
- Around line 348-350: The password Secret lookup in mcp_server.go currently
treats any existing Secret as valid, even when LightspeedPasswordSecretName is
present without LightspeedPasswordSecretKey and
string(sec.Data[LightspeedPasswordSecretKey]) becomes empty. Update the
secret-handling logic in the password retrieval path to explicitly verify the
key exists and has a non-empty value before returning it; if the Secret is
missing the key, return an error instead of an empty password so the Keystone
user is not created or reconciled with a blank credential.

In `@internal/controller/openstacklightspeed_controller.go`:
- Around line 162-168: The polling in the OpenStack control flow stops too early
for cross-namespace instances, so readiness and endpoint/CA updates can be
missed after discovery. Update the logic around the `oscpWatch` check in
`openstacklightspeed_controller.go` so `RequeueAfter` continues until a real
cross-namespace watch is established, or add watch coverage for
`OpenStackControlPlane` changes outside the operator namespace; ensure the
reconcile path that handles `OpenStackReady` still requeues or watches the
relevant `OpenStackControlPlane` updates.
- Around line 479-518: Keep the DynamicWatchCRD flag latched once a watch has
been registered in WatchDynamicCRD. The current Reconcile-driven loop in
openstacklightspeed_controller.go clears seen when IsCRDEstablished reports the
CRD is gone, which can allow r.controller.Watch(...) to be registered again
later even though there is no unregister path. Update the logic around the
DynamicWatchCRD/seen handling so a successful watch registration stays marked
true and does not attempt duplicate setup on subsequent reconciles.

In `@test/kuttl/tests/application-credentials/04-assert-mock-openstack.yaml`:
- Around line 4-15: The step timeout is too tight for the commands in the mock
OpenStack assertion script, which can cause flakes when the waits run close to
their limits. Increase the timeout in the test step so the combined `oc wait`
and `oc get` operations in the `commands` block have enough headroom; update the
`timeout` setting in this KUTTL test manifest accordingly.

In `@test/kuttl/tests/application-credentials/06-assert-ac-resources.yaml`:
- Around line 10-42: The wait loops in the application-credentials assertion
script can take longer than the step timeout, so the test may be terminated
before finishing its checks. Update the timeout setting in the same script to
exceed the worst-case combined wait time used by the secret and AC CR polling
loops, and keep the timeout aligned with the logic in the shell block that waits
for lightspeed-password and the
keystoneapplicationcredentials.keystone.openstack.org resource.

In `@test/kuttl/tests/application-credentials/08-assert-mcp-credentials.yaml`:
- Around line 42-64: The checks in the application-credentials KUTTL assertion
script read `openstack-config-secret` and `mcp-config` only once, which can race
with resource availability. Update the verification flow in this test to poll
with retries and short waits before asserting on `SECURE` and `MCP_CFG`,
re-fetching the Secret and ConfigMap until the expected keys are present or a
timeout is reached. Keep the existing `oc get`/`grep` assertions in place, but
wrap them in retry logic so the `clouds.yaml`-driven readiness is handled
reliably.

In `@test/kuttl/tests/application-credentials/12-cleanup-mock-openstack.yaml`:
- Around line 13-15: The teardown in the cleanup step uses a fixed 60-second
wait for namespace deletion, which can make the scenario flaky on slower
clusters. Update the cleanup logic around the namespace deletion command in the
test to avoid relying on a hard `--timeout=60s`; instead use delete plus polling
for completion, or increase it to a safer bounded wait before removing the CRDs.

In
`@test/kuttl/tests/persistent-database/04-assert-openstack-lightspeed-instance.yaml`:
- Around line 51-52: The OpenStackLightspeedMCPServerReady assertion only checks
for a True status, so it can pass even when the MCP is enabled through an
unintended path. Update the existing assertion in this persistent-database test
to also verify the disabled message, matching the other suite checks, so the
condition is specifically validating the expected disabled-state behavior rather
than any True result.

---

Nitpick comments:
In `@test/kuttl/common/mock-openstack/mock-keystone.yaml`:
- Line 118: The mock Keystone fixture is using a floating image tag, which makes
the KUTTL test environment nondeterministic. Update the image reference in the
mock OpenStack YAML to a pinned version or digest instead of
registry.redhat.io/ubi8/python-311:latest, keeping the mock setup reproducible
across CI runs.

In `@test/kuttl/common/mock-openstack/mock-oscp-resources.yaml`:
- Around line 84-86: The mock OpenStackControlPlane status only sets
status.tls.caBundleSecretName, so the readiness-gated path in the MCP enablement
flow is never exercised. Update the mock status fixture used by
OpenStackControlPlane to include a ready condition in addition to the TLS
bundle, using the existing OpenStackControlPlane status shape so the KUTTL
scenario can assert both pre-ready and post-ready behavior if needed.

In `@test/kuttl/tests/rhos-mcps-configuration/04-disable-rhos-mcps.yaml`:
- Around line 21-22: Add an explicit assertion after disabling the feature flag
to verify the sidecar is removed from the deployment. Update the KUTTL scenario
around the rhos-mcps toggle so it checks that lightspeed-stack-deployment no
longer contains rhos-mcp, in addition to the existing mcp-config cleanup
validation, to catch regressions in the disable path.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ecbaef71-33aa-44a9-a9ab-3a2bd4b1ea79

📥 Commits

Reviewing files that changed from the base of the PR and between 6a09734 and f0918cb.

⛔ Files ignored due to path filters (1)

go.sum is excluded by !**/*.sum

📒 Files selected for processing (71)

.github/workflows/build-and-push.yaml
.github/workflows/verify-generation.yaml
Makefile
api/v1beta1/conditions.go
api/v1beta1/openstacklightspeed_types.go
bundle.Dockerfile
bundle/manifests/lightspeed.openstack.org_openstacklightspeeds.yaml
bundle/manifests/openstack-lightspeed-operator.clusterserviceversion.yaml
bundle/metadata/annotations.yaml
cmd/main.go
config/crd/bases/lightspeed.openstack.org_openstacklightspeeds.yaml
config/manager/manager.yaml
config/manifests/bases/openstack-lightspeed-operator.clusterserviceversion.yaml
config/rbac/role.yaml
config/samples/api_v1beta1_openstacklightspeed.yaml
go.mod
internal/controller/assets/mcp_server_config.yaml.tmpl
internal/controller/common.go
internal/controller/constants.go
internal/controller/lcore_config.go
internal/controller/lcore_deployment.go
internal/controller/mcp_server.go
internal/controller/mcp_server_test.go
internal/controller/openstacklightspeed_controller.go
internal/controller/openstacklightspeed_controller_test.go
internal/controller/postgres_reconciler.go
test/kuttl/common/expected-configs/lightspeed-stack-okp.yaml
test/kuttl/common/expected-configs/lightspeed-stack-update.yaml
test/kuttl/common/expected-configs/lightspeed-stack.yaml
test/kuttl/common/mock-openstack/assert-mock-openstack.yaml
test/kuttl/common/mock-openstack/cleanup-mock-openstack.yaml
test/kuttl/common/mock-openstack/errors-mock-openstack.yaml
test/kuttl/common/mock-openstack/kac-crd.yaml
test/kuttl/common/mock-openstack/mock-keystone.yaml
test/kuttl/common/mock-openstack/mock-oscp-resources.yaml
test/kuttl/common/mock-openstack/oscp-crd.yaml
test/kuttl/common/mock-openstack/simulate-keystone-operator.sh
test/kuttl/common/openstack-lightspeed-instance/assert-lightspeed-stack-config.yaml
test/kuttl/common/openstack-lightspeed-instance/assert-mcp-config.yaml
test/kuttl/common/openstack-lightspeed-instance/assert-openstack-lightspeed-instance.yaml
test/kuttl/common/openstack-lightspeed-instance/errors-openstack-lightspeed-instance.yaml
test/kuttl/tests/application-credentials/00-mock-resources.yaml
test/kuttl/tests/application-credentials/01-assert-mock-objects-created.yaml
test/kuttl/tests/application-credentials/02-create-openstack-crds.yaml
test/kuttl/tests/application-credentials/03-create-mock-openstack.yaml
test/kuttl/tests/application-credentials/04-assert-mock-openstack.yaml
test/kuttl/tests/application-credentials/05-create-openstack-lightspeed-instance.yaml
test/kuttl/tests/application-credentials/06-assert-ac-resources.yaml
test/kuttl/tests/application-credentials/07-simulate-keystone-operator.yaml
test/kuttl/tests/application-credentials/08-assert-mcp-credentials.yaml
test/kuttl/tests/application-credentials/09-cleanup-openstack-lightspeed-instance.yaml
test/kuttl/tests/application-credentials/10-assert-ac-cleanup.yaml
test/kuttl/tests/application-credentials/11-errors-openstack-lightspeed-instance.yaml
test/kuttl/tests/application-credentials/12-cleanup-mock-openstack.yaml
test/kuttl/tests/application-credentials/13-assert-mock-openstack.yaml
test/kuttl/tests/application-credentials/14-cleanup-mock-objects.yaml
test/kuttl/tests/application-credentials/15-errors-mock-objects.yaml
test/kuttl/tests/okp-configuration/03-assert-okp-instance.yaml
test/kuttl/tests/persistent-database/04-assert-openstack-lightspeed-instance.yaml
test/kuttl/tests/rhos-mcps-configuration/00-mock-resources.yaml
test/kuttl/tests/rhos-mcps-configuration/01-assert-mock-objects-created.yaml
test/kuttl/tests/rhos-mcps-configuration/02-create-rhos-mcps-resources.yaml
test/kuttl/tests/rhos-mcps-configuration/03-assert-rhos-mcps-instance.yaml
test/kuttl/tests/rhos-mcps-configuration/04-disable-rhos-mcps.yaml
test/kuttl/tests/rhos-mcps-configuration/05-errors-rhos-mcps-cleanup.yaml
test/kuttl/tests/rhos-mcps-configuration/06-cleanup-openstack-lightspeed-instance.yaml
test/kuttl/tests/rhos-mcps-configuration/07-errors-openstack-lightspeed-instance.yaml
test/kuttl/tests/rhos-mcps-configuration/08-cleanup-mock-objects.yaml
test/kuttl/tests/rhos-mcps-configuration/09-errors-mock-objects.yaml
test/kuttl/tests/update-openstacklightspeed/08-assert-openstacklightspeed-update.yaml
test/kuttl/tests/update-openstacklightspeed/11-assert-configmaps-update.yaml

coderabbitai · 2026-06-24T15:22:29Z

+	// MCPServerContainerImage is the fall-back container image for the MCP server
+	MCPServerContainerImage = "quay.io/openstack-lightspeed/rhos-mcps:latest"


🩺 Stability & Availability | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

sed -n '1,120p' api/v1beta1/openstacklightspeed_types.go && printf '\n---\n' && sed -n '300,360p' api/v1beta1/openstacklightspeed_types.go

Repository: openstack-lightspeed/operator

Length of output: 6783

🏁 Script executed:

rg -n "MCPServerContainerImage|RELATED_IMAGE_MCP_SERVER_IMAGE_URL_DEFAULT|latest|rhos-mcps" api/v1beta1/openstacklightspeed_types.go api -S

Repository: openstack-lightspeed/operator

Length of output: 2317

🏁 Script executed:

git diff --stat -- api/v1beta1/openstacklightspeed_types.go && printf '\n---\n' && git diff -- api/v1beta1/openstacklightspeed_types.go --unified=20

Repository: openstack-lightspeed/operator

Length of output: 172

🏁 Script executed:

rg -n "RELATED_IMAGE_MCP_SERVER_IMAGE_URL_DEFAULT|MCPServerImageURL|MCPServerContainerImage|rhos-mcps" -S .

Repository: openstack-lightspeed/operator

Length of output: 1449

🏁 Script executed:

rg -n "RELATED_IMAGE_.*IMAGE_URL_DEFAULT|RELATED_IMAGE_.*DEFAULT" -S config bundle manifests deploy api controllers | sed -n '1,220p'

Repository: openstack-lightspeed/operator

Length of output: 3145

🏁 Script executed:

fd -a -t f 'csv|yaml|yml' config bundle manifests deploy . | sed -n '1,200p'

Repository: openstack-lightspeed/operator

Length of output: 10746

🏁 Script executed:

#!/bin/bash set -euo pipefail echo "== related image refs ==" rg -n "RELATED_IMAGE_MCP_SERVER_IMAGE_URL_DEFAULT|RELATED_IMAGE_.*IMAGE_URL_DEFAULT" config bundle manifests deploy api controllers -S || true echo echo "== candidate manifest files ==" fd -a -t f 'csv|yaml|yml' config bundle manifests deploy . | sed -n '1,200p'

Repository: openstack-lightspeed/operator

Length of output: 13781

🏁 Script executed:

#!/bin/bash set -euo pipefail echo "== related image refs ==" rg -n "RELATED_IMAGE_MCP_SERVER_IMAGE_URL_DEFAULT|RELATED_IMAGE_.*IMAGE_URL_DEFAULT" config bundle manifests deploy api controllers -S || true echo echo "== manifest files ==" git ls-files 'config/**' 'bundle/**' 'manifests/**' 'deploy/**' | sed -n '1,200p'

Repository: openstack-lightspeed/operator

Length of output: 5126

Pin the MCP sidecar image to an immutable release.

quay.io/openstack-lightspeed/rhos-mcps:latest is still used as the default in api/v1beta1/openstacklightspeed_types.go, and the same tag is baked into config/manager/manager.yaml and the generated CSV. That makes installs and upgrades pick up whatever image is current at deploy time instead of the tested MCP binary.

Suggested change

- MCPServerContainerImage = "quay.io/openstack-lightspeed/rhos-mcps:latest" + MCPServerContainerImage = "quay.io/openstack-lightspeed/rhos-mcps:<release-or-digest>"

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@api/v1beta1/openstacklightspeed_types.go` around lines 49 - 50, The MCP sidecar image is still defaulting to a mutable “latest” tag, which means deployments can drift from the tested binary. Update the default image constant in openstacklightspeed_types.go and ensure the same pinned, immutable release image is used in manager.yaml and the generated CSV so all install and upgrade paths reference the exact released MCP server image.

coderabbitai · 2026-06-24T15:22:30Z

+	dynamicWatchCRDs, err := getDynamicWatchCRDs()
+	if err != nil {
+		setupLog.Error(err, "unable to retrieve DynamicWatchCRDs")
+		os.Exit(1)
+	}


📐 Maintainability & Code Quality | 🟠 Major | ⚡ Quick win

Drop the dead error return from getDynamicWatchCRDs.

getDynamicWatchCRDs never returns a non-nil error, and golangci-lint flags this as unparam; this can block CI until the signature and caller are simplified.

Proposed fix

- dynamicWatchCRDs, err := getDynamicWatchCRDs() - if err != nil { - setupLog.Error(err, "unable to retrieve DynamicWatchCRDs") - os.Exit(1) - } + dynamicWatchCRDs := getDynamicWatchCRDs()

-func getDynamicWatchCRDs() (map[schema.GroupVersionKind]*atomic.Bool, error) { +func getDynamicWatchCRDs() map[schema.GroupVersionKind]*atomic.Bool { dynamicWatchCRDs := map[schema.GroupVersionKind]*atomic.Bool{ { Group: "core.openstack.org", @@ } - return dynamicWatchCRDs, nil + return dynamicWatchCRDs }

Also applies to: 251-266

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@cmd/main.go` around lines 193 - 197, The `getDynamicWatchCRDs` call site still treats the function as returning an error even though it never does, which triggers the `unparam` lint issue. Update `getDynamicWatchCRDs` to remove the dead error return, then simplify its callers in `main` and any other affected paths (including the other mentioned block) to stop checking `err` and just use the returned CRD value(s) directly.

Source: Linters/SAST tools

coderabbitai · 2026-06-24T15:22:30Z

+- apiGroups:
+  - ""
+  resources:
+  - secrets
+  verbs:
+  - create
+  - delete
+  - get
+  - list
+  - patch
+  - update
+  - watch


🔒 Security & Privacy | 🟠 Major | ⚡ Quick win

Narrow cluster-scoped Secret access.

This ClusterRole now grants create/delete/list/watch/update/patch on every Secret in the cluster, making the pull-secret resourceNames rule below ineffective. Keep cluster scope to the minimum external Secret reads needed for MCP/Keystone, and rely on the namespaced Role for managed Secret writes.

Suggested direction

- apiGroups: - "" resources: - secrets verbs: - - create - - delete - get - - list - - patch - - update - - watch

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

- apiGroups:

- ""

resources:

- secrets

verbs:

- create

- delete

- get

- list

- patch

- update

- watch

- apiGroups:

- ""

resources:

- secrets

verbs:

- get

🧰 Tools

🪛 Trivy (0.69.3)

[error] 15-26: Manage secrets

ClusterRole 'manager-role' shouldn't have access to manage resource 'secrets'

Rule: KSV-0041

Learn more

(IaC/Kubernetes)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@config/rbac/role.yaml` around lines 15 - 26, The ClusterRole in role.yaml is granting broad cluster-wide Secret write/read permissions that override the intended `resourceNames` restriction for `pull-secret`. Tighten the `secrets` rule in this ClusterRole to only the minimal cluster-scoped Secret access needed for MCP/Keystone, and move any Secret create/update/patch/delete permissions to the namespaced Role used for managed Secrets. Keep the existing `resourceNames`-based access for `pull-secret` and verify the `ClusterRole` versus `Role` split preserves only the required external reads.

Source: Linters/SAST tools

This one is important for us to address. Not sure we want the operator to read/write every secret in the cluster

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash set -euo pipefail echo "== Outline: internal/controller/common.go ==" ast-grep outline internal/controller/common.go --view expanded || true echo echo "== Outline: internal/controller/mcp_server.go ==" ast-grep outline internal/controller/mcp_server.go --view expanded || true echo echo "== Secret/ConfigMap API usage in controller code ==" rg -n -C2 --glob 'internal/controller/*.go' '\b(Get|List|CreateOrPatch|Delete|Patch|Update)\b|Secret\b|ConfigMap\b' internal/controller echo echo "== Focus: common.go relevant slice ==" sed -n '1,260p' internal/controller/common.go echo echo "== Focus: mcp_server.go relevant slices ==" sed -n '1,260p' internal/controller/mcp_server.go echo sed -n '260,620p' internal/controller/mcp_server.go echo sed -n '620,980p' internal/controller/mcp_server.go echo echo "== Focus: openstacklightspeed_controller.go relevant slices ==" sed -n '1,260p' internal/controller/openstacklightspeed_controller.go

Length of output: 161955

@umago, I re-checked the controller code and agree this is still an important issue to address.

One nuance though: the current MCP flow does need some cross-namespace Secret access in the OpenStackControlPlane namespace, so my earlier “reduce it to get only” direction was too aggressive.

What I found:

ensurePasswordSecret() creates lightspeed-password in the OSCP namespace.

reconcileACSecret() updates finalizers on the current/rotated AC Secret in that namespace.

reconcileDeleteOpenStackResources() removes that finalizer and deletes lightspeed-password.

The rest of the external Secret usage is read-only.

So the concern is valid, but the tighter fix is:

keep the namespaced Role for operator-owned Secrets in openstack-lightspeed

narrow the cluster-scoped secrets rule to only the verbs actually used cross-namespace

likely get, create, update, delete

drop cluster-wide list, watch, and patch for Secrets unless there’s another caller not in this PR

Also worth checking separately: the cluster-scoped configmaps rule looks read-only from current usage, so that one may be reducible too.

If the goal is zero cluster-wide Secret writes, that would require a code-path redesign, because today the reconciler explicitly mutates Secrets in the OSCP namespace.

coderabbitai · 2026-06-24T15:22:30Z

+  # Uncomment to enable RHOS MCPs (MCP server sidecar with OpenStack/OpenShift tools):
+  # dev:
+  #   featureFlags:
+  #     - rhos_mcps


🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

Avoid a second commented dev: block in the sample.

If users enable both RHOS MCPs and OKP by uncommenting both examples, the sample ends up with duplicate dev keys, so one block gets dropped. Fold rhos_mcps into the existing commented dev.featureFlags example instead.

Suggested change

- # Uncomment to enable RHOS MCPs (MCP server sidecar with OpenStack/OpenShift tools): - # dev: - # featureFlags: - # - rhos_mcps + # Uncomment to enable feature flags: + # dev: + # featureFlags: + # - rhos_mcps # Uncomment to enable OKP (Offline Knowledge Portal) as an Inline RAG source: # okp: # accessKey: okp-access-key-secret - # dev: - # featureFlags: - # - okp + # featureFlags: + # - okp # okpChunkFilterQuery: "product:(*openstack* OR *openshift*)" # okpRagOnly: true

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@config/samples/api_v1beta1_openstacklightspeed.yaml` around lines 18 - 21, The sample has a duplicate commented dev configuration block that can lead users to create two dev keys when enabling both examples. Update the OpenStack Lightspeed sample so the rhos_mcps option is folded into the existing commented dev.featureFlags example rather than adding a second dev section, using the dev and featureFlags sample entries to keep the structure unambiguous.

@umago Sounds good — consolidating this into a single commented dev.featureFlags example will keep the sample unambiguous when both RHOS MCPs and OKP are enabled.

🐇👍

coderabbitai · 2026-06-24T15:22:30Z

+// GetDeploymentVolumeSection returns a pointer to the Volume in the Deployment's PodSpec
+// whose name matches the given volumeSectionName. If no such volume is found, it returns nil.
+func GetDeploymentVolumeSection(deployment appsv1.Deployment, volumeSectionName string) *corev1.Volume {
+	for i, volume := range deployment.Spec.Template.Spec.Volumes {
+		if volume.Name == volumeSectionName {
+			return &deployment.Spec.Template.Spec.Volumes[i]
+		}
+	}
+	return nil
+}


🎯 Functional Correctness | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # Description: Inspect all GetDeploymentVolumeSection call sites after changing the helper to accept *appsv1.Deployment. rg -n -C3 --type=go '\bGetDeploymentVolumeSection\s*\(' .

Repository: openstack-lightspeed/operator

Length of output: 866

🏁 Script executed:

#!/bin/bash set -euo pipefail # Find all references to the helper and inspect the surrounding code. rg -n -C 4 '\bGetDeploymentVolumeSection\s*\(' internal .

Repository: openstack-lightspeed/operator

Length of output: 1706

🏁 Script executed:

#!/bin/bash set -euo pipefail # Find all references to the helper and inspect the surrounding code. rg -n -C 4 '\bGetDeploymentVolumeSection\s*\(' internal .

Repository: openstack-lightspeed/operator

Length of output: 1706

🏁 Script executed:

#!/bin/bash set -euo pipefail sed -n '240,330p' internal/controller/common.go

Repository: openstack-lightspeed/operator

Length of output: 3229

🏁 Script executed:

#!/bin/bash set -euo pipefail sed -n '240,330p' internal/controller/common.go

Repository: openstack-lightspeed/operator

Length of output: 3229

🏁 Script executed:

#!/bin/bash set -euo pipefail sed -n '240,330p' internal/controller/common.go

Repository: openstack-lightspeed/operator

Length of output: 3229

🏁 Script executed:

#!/bin/bash set -euo pipefail # Search for all references to the helper and any direct mutation of its result. rg -n '\bGetDeploymentVolumeSection\b|GetDeploymentVolumeSection\s*\(' internal .

Repository: openstack-lightspeed/operator

Length of output: 697

🏁 Script executed:

#!/bin/bash set -euo pipefail # Search for all references to the helper and any direct mutation of its result. rg -n '\bGetDeploymentVolumeSection\b|GetDeploymentVolumeSection\s*\(' internal .

Repository: openstack-lightspeed/operator

Length of output: 697

🏁 Script executed:

#!/bin/bash set -euo pipefail # Search for all references to the helper and any direct mutation of its result. rg -n '\bGetDeploymentVolumeSection\b|GetDeploymentVolumeSection\s*\(' internal .

Repository: openstack-lightspeed/operator

Length of output: 697

🏁 Script executed:

#!/bin/bash set -euo pipefail # Search for all references to the helper and any direct mutation of its result. rg -n '\bGetDeploymentVolumeSection\b|GetDeploymentVolumeSection\s*\(' internal .

Repository: openstack-lightspeed/operator

Length of output: 697

Return a pointer into the caller’s Deployment, not a copy. GetDeploymentVolumeSection takes appsv1.Deployment by value, so the returned *corev1.Volume points into a copied struct. If callers need to mutate the volume, accept *appsv1.Deployment instead (with a nil guard).

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/controller/common.go` around lines 292 - 301, GetDeploymentVolumeSection currently takes appsv1.Deployment by value, so the returned *corev1.Volume points into a copied Deployment rather than the caller’s object. Update GetDeploymentVolumeSection to accept a *appsv1.Deployment (add a nil guard) and return the matching volume from deployment.Spec.Template.Spec.Volumes so any caller mutation affects the original Deployment.

coderabbitai · 2026-06-24T15:22:30Z

+timeout: 120
+commands:
+  - script: |
+      #!/bin/bash
+      set -euo pipefail
+      oc wait --for=condition=Established \
+        crd/openstackcontrolplanes.core.openstack.org \
+        crd/keystoneapplicationcredentials.keystone.openstack.org \
+        --timeout=60s
+      oc get namespace openstack
+      oc wait --for=condition=Ready pod/mock-keystone-server -n openstack --timeout=60s
+      oc get openstackcontrolplane openstack-galera-network-isolation -n openstack


🩺 Stability & Availability | 🟡 Minor | ⚡ Quick win

Increase step timeout to avoid edge-timeout flakes.

Line 4 sets timeout: 120, while Lines 9-12 and Line 14 can already consume up to 120s combined, so command overhead can trigger premature step timeout.

Suggested fix

-kind: TestAssert -timeout: 120 +kind: TestAssert +timeout: 180

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

timeout: 120

commands:

- script: |

#!/bin/bash

set -euo pipefail

oc wait --for=condition=Established \

crd/openstackcontrolplanes.core.openstack.org \

crd/keystoneapplicationcredentials.keystone.openstack.org \

--timeout=60s

oc get namespace openstack

oc wait --for=condition=Ready pod/mock-keystone-server -n openstack --timeout=60s

oc get openstackcontrolplane openstack-galera-network-isolation -n openstack

timeout: 180

commands:

- script: |

#!/bin/bash

set -euo pipefail

oc wait --for=condition=Established \

crd/openstackcontrolplanes.core.openstack.org \

crd/keystoneapplicationcredentials.keystone.openstack.org \

--timeout=60s

oc get namespace openstack

oc wait --for=condition=Ready pod/mock-keystone-server -n openstack --timeout=60s

oc get openstackcontrolplane openstack-galera-network-isolation -n openstack

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/kuttl/tests/application-credentials/04-assert-mock-openstack.yaml` around lines 4 - 15, The step timeout is too tight for the commands in the mock OpenStack assertion script, which can cause flakes when the waits run close to their limits. Increase the timeout in the test step so the combined `oc wait` and `oc get` operations in the `commands` block have enough headroom; update the `timeout` setting in this KUTTL test manifest accordingly.

coderabbitai · 2026-06-24T15:22:30Z

+      for i in $(seq 1 60); do
+        if oc get secret lightspeed-password -n openstack 2>/dev/null; then
+          echo "Password secret found"
+          break
+        fi
+        if [ "$i" -eq 60 ]; then
+          echo "ERROR: Password secret not created after 120s"
+          exit 1
+        fi
+        sleep 2
+      done
+
+      PW=$(oc get secret lightspeed-password -n openstack \
+        -o jsonpath='{.data.password}' | base64 -d)
+      if [ -z "$PW" ]; then
+        echo "ERROR: Password secret has no password key"
+        exit 1
+      fi
+      echo "Password secret has password key (length: ${#PW})"
+
+      echo "Waiting for AC CR to be created..."
+      for i in $(seq 1 60); do
+        if oc get keystoneapplicationcredentials.keystone.openstack.org lightspeed \
+             -n openstack 2>/dev/null; then
+          echo "AC CR found"
+          break
+        fi
+        if [ "$i" -eq 60 ]; then
+          echo "ERROR: AC CR not created after 120s"
+          exit 1
+        fi
+        sleep 2
+      done


🩺 Stability & Availability | 🟠 Major | ⚡ Quick win

Command timeout is lower than the script’s worst-case wait path.

Lines 10-20 and 31-42 can wait 240s total, but Line 67 sets timeout to 180s, so the step can be killed before the script completes its intended checks.

Suggested fix

- timeout: 180 + timeout: 300

Also applies to: 67-67

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/kuttl/tests/application-credentials/06-assert-ac-resources.yaml` around lines 10 - 42, The wait loops in the application-credentials assertion script can take longer than the step timeout, so the test may be terminated before finishing its checks. Update the timeout setting in the same script to exceed the worst-case combined wait time used by the secret and AC CR polling loops, and keep the timeout aligned with the logic in the shell block that waits for lightspeed-password and the keystoneapplicationcredentials.keystone.openstack.org resource.

coderabbitai · 2026-06-24T15:22:30Z

+      # Verify openstack-config-secret Secret
+      SECURE=$(oc get secret openstack-config-secret -n openstack-lightspeed \
+        -o jsonpath='{.data.secure\.yaml}' | base64 -d)
+      if ! echo "$SECURE" | grep -q "application_credential_secret: mock-ac-secret-abcde"; then
+        echo "ERROR: secure.yaml missing application_credential_secret"
+        echo "$SECURE"
+        exit 1
+      fi
+      echo "secure.yaml has correct application_credential_secret"
+
+      # Verify MCP config has OpenStack enabled
+      MCP_CFG=$(oc get configmap mcp-config -n openstack-lightspeed \
+        -o jsonpath='{.data.config\.yaml}')
+      if ! echo "$MCP_CFG" | grep -qP "openstack:\s*\n\s+enabled:\s+true"; then
+        # Try with a simpler grep
+        ENABLED=$(echo "$MCP_CFG" | grep -A1 "openstack:" | grep "enabled:" | tr -d ' ')
+        if [ "$ENABLED" != "enabled:true" ]; then
+          echo "ERROR: MCP config does not have openstack enabled"
+          echo "$MCP_CFG"
+          exit 1
+        fi
+      fi
+      echo "MCP config has OpenStack enabled"


🩺 Stability & Availability | 🟡 Minor | ⚡ Quick win

Add readiness polling for dependent MCP artifacts.

After detecting clouds.yaml, the script reads openstack-config-secret and mcp-config only once. Those resources can lag and cause intermittent failures; poll them with retries before content assertions.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/kuttl/tests/application-credentials/08-assert-mcp-credentials.yaml` around lines 42 - 64, The checks in the application-credentials KUTTL assertion script read `openstack-config-secret` and `mcp-config` only once, which can race with resource availability. Update the verification flow in this test to poll with retries and short waits before asserting on `SECURE` and `MCP_CFG`, re-fetching the Secret and ConfigMap until the expected keys are present or a timeout is reached. Keep the existing `oc get`/`grep` assertions in place, but wrap them in retry logic so the `clouds.yaml`-driven readiness is handled reliably.

coderabbitai · 2026-06-24T15:22:30Z

+      oc delete namespace openstack --ignore-not-found=true --timeout=60s
+      oc delete crd openstackcontrolplanes.core.openstack.org keystoneapplicationcredentials.keystone.openstack.org --ignore-not-found=true
+    timeout: 120


🩺 Stability & Availability | 🟡 Minor | ⚡ Quick win

Avoid a fixed 60s namespace-delete wait in teardown.

The hard --timeout=60s can fail on slower clusters while cleanup is still progressing, making this scenario flaky. Prefer delete+poll (or a larger bounded wait) before proceeding.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/kuttl/tests/application-credentials/12-cleanup-mock-openstack.yaml` around lines 13 - 15, The teardown in the cleanup step uses a fixed 60-second wait for namespace deletion, which can make the scenario flaky on slower clusters. Update the cleanup logic around the namespace deletion command in the test to avoid relying on a hard `--timeout=60s`; instead use delete plus polling for completion, or increase it to a safer bounded wait before removing the CRDs.

coderabbitai · 2026-06-24T15:22:30Z

+    - type: OpenStackLightspeedMCPServerReady
+      status: "True"


🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

Assert the MCP disabled message here as well.

Line 51 only checks that the condition is True. In this suite rhos_mcps is still off by default, so the test should also verify the disabled message; otherwise a different True path would still pass.

Suggested change

- type: OpenStackLightspeedMCPServerReady status: "True" + message: "RHOS MCP server is disabled (rhos_mcps feature flag not set)"

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

- type: OpenStackLightspeedMCPServerReady

status: "True"

- type: OpenStackLightspeedMCPServerReady

status: "True"

message: "RHOS MCP server is disabled (rhos_mcps feature flag not set)"

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/kuttl/tests/persistent-database/04-assert-openstack-lightspeed-instance.yaml` around lines 51 - 52, The OpenStackLightspeedMCPServerReady assertion only checks for a True status, so it can pass even when the MCP is enabled through an unintended path. Update the existing assertion in this persistent-database test to also verify the disabled message, matching the other suite checks, so the condition is specifically validating the expected disabled-state behavior rather than any True result.

With this commit we start deploying the MCP server as a sidecar container of the lightspeed-service container. The MCP server deployed is the one from our own rhos-mcps repository [1]. On installation the `openshift-cli` tool is enabled when OpenStack Lightspeed is configured, and the `openstack-cli` tool is enabled when the `OpenStackControlPlane` is ready. From a security perspective since we are deploying the MCP tools in the pod's network namespace not using TLS is not a real security risk. [1]: https://github.com/openstack-lightspeed/rhos-mcps Jira: OSPRH-27075 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

For now we don't want to automatically deploy the MCP tools, because: - Releasing the rhos-mcps image may be done at a later time. - This is an experimental feature right now, so we want it disabled by default. The name of the feature flag is `rhos_mcps` as shown in the config ample. Jira: OSPRH-27075 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Initial implementation of the MCP deployment uses the credentials from the `openstackclient` pod, which means that we are not the owners of that secret, just the copy we make in the `openstack-lightspeed` namespace, so those credentials could be removed/deleted and that would break our `openstack-cli` tool. In this patch we change the credentials and we leverage the `KeystoneApplicationCredential` CR to get our own credentials. Credential Rotation is handled by the code as well. Jira: OSPRH-27075 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

umago

Still looking but I think there are some comments (specially the ones from coderabbit) that needs to be addressed

umago · 2026-06-26T14:33:39Z

+- apiGroups:
+  - ""
+  resources:
+  - secrets
+  verbs:
+  - create
+  - delete
+  - get
+  - list
+  - patch
+  - update
+  - watch


This one is important for us to address. Not sure we want the operator to read/write every secret in the cluster

umago · 2026-06-26T14:34:08Z

+  # Uncomment to enable RHOS MCPs (MCP server sidecar with OpenStack/OpenShift tools):
+  # dev:
+  #   featureFlags:
+  #     - rhos_mcps


umago · 2026-06-26T14:56:47Z

+	for gvk, seen := range r.DynamicWatchCRD {
+		if seen.Load() {
+			// Re-verify CRD still exists — it may have been uninstalled.
+			crdAvailable, err := IsCRDEstablished(ctx, helper, gvk)
+			if err != nil {
+				return err
+			}
+			if !crdAvailable {
+				seen.Store(false)
+			}
+			continue
+		}
+
+		crdAvailable, err := IsCRDEstablished(ctx, helper, gvk)
+		if err != nil {
+			return err
+		}
+
+		if !crdAvailable {
+			continue
+		}
+
+		GVKUnstructObj := &uns.Unstructured{}
+		GVKUnstructObj.SetGroupVersionKind(gvk)
+		err = r.controller.Watch(
+			source.Kind(
+				r.Cache,
+				GVKUnstructObj,
+				handler.TypedEnqueueRequestsFromMapFunc(func(ctx context.Context, o *uns.Unstructured) []ctrl.Request {
+					return r.NotifyAllOpenStackLightspeeds(ctx, o)
+				}),
+				predicate.TypedResourceVersionChangedPredicate[*uns.Unstructured]{},
+			),
+		)
+		if err != nil {
+			return fmt.Errorf("failed to set up watch for %s: %w", GetCRDName(gvk), err)
+		}
+
+		seen.Store(true)
+	}


That's a good point. Stale watches are not cleaned up when/if a CRD disappears, would be good to address this too.

umago · 2026-06-26T15:04:02Z

+
+	if err := osClient.AssignUserRole(log, "admin", userID, serviceProject.ID); err != nil {
+		return fmt.Errorf("failed to assign admin role: %w", err)
+	}


Should we cache this to say that the user was already created successfully ? Since this is being called as part of reconcileMCPServerWithOpenStack it will talk to keystone on everyr reconcile. Feels unnecessary

umago · 2026-06-26T15:17:21Z

+		Watches(
+			&apiextensionsv1.CustomResourceDefinition{},
+			handler.EnqueueRequestsFromMapFunc(r.NotifyAllOpenStackLightspeeds),
+			builder.WithPredicates(predicate.ResourceVersionChangedPredicate{}),


nit: Should we filter this ? And watch for openstackcontrolplanes.core.openstack.org specially ? Right now we are watching all CRDs in the cluster

umago · 2026-06-26T15:18:58Z

 			builder.WithPredicates(predicate.ResourceVersionChangedPredicate{}),
 		).
 		Watches(
 			clusterVersion,


follwo-up comment L407 (below)

I guess we do the same here, but I think clusterVersion is just one in the cluster but CRDs we can have many.

lpiwowar

Overall I like the PR 👍 . Thank you! I like the KeystoneApplicationCredentials part.

lpiwowar · 2026-06-29T09:50:56Z

+}
+
+// OpenStackLightspeedChecksumAnnotation is the annotation key used to store the checksum of resources.
+const OpenStackLightspeedChecksumAnnotation = "openstack.org/checksum"


nit (non-blocking): This can be moved to constants.go.

lpiwowar · 2026-06-29T09:59:33Z

+		// cache-based watch only covers the operator's namespace, so
+		// periodic reconciliation discovers OSCP instances elsewhere.
+		// Once OpenStack is detected and configured, the watch handles updates.
+		oscpWatch := r.DynamicWatchCRD[OpenStackControlPlaneGVK()]


suggestion (non-blocking): I know that there is a dependency between OpenStackControlPlane and KeystoneApplicationCredential CRDs. Meaning KeystoneApplicationCredential can not exist without OpenStackControlPlane and vice versa but it is not obvious here.

I think this part should be generic. What I'm trying to say is that there should be a for loop over r.DynamicWatchCRD and result.RequeueAfter setting should be skipped only when all dynamically watched CRDs are present.

lpiwowar · 2026-06-29T11:06:38Z

          limits:
            cpu: 500m
-            memory: 128Mi
+            memory: 256Mi


question (non-blocking): I'm just curious why exactly is this needed. What error was ecnountered?

lpiwowar · 2026-06-29T11:14:28Z

+			VolumeSource: corev1.VolumeSource{
+				Secret: &corev1.SecretVolumeSource{
+					SecretName: CombinedCABundleSecretName,
+					Optional:   toPtr(true),


question (blocking): Why do we need to set Optional == true? I think all of the resources are needed for proper functioning of the MCP server. By setting to true we are hiding errors related to missing clouds.yaml / secure.yaml or tls-ca-bundle.pem in the MCP servers logs. If we keep it set to false it will be immediately obvious what is wrong if the MCP server fails to deploy due to missing creds (or any other file ..)

lpiwowar · 2026-06-29T11:20:14Z

+	KeystoneApplicationCredentialVersion = "v1beta1"
+	KeystoneApplicationCredentialKind    = "KeystoneApplicationCredential"
+
+	// Lightspeed Service User


nit (non-blocking): The comment could be a little bit more descriptive. Hard to tell just from the constants name what they are related to.

lpiwowar · 2026-06-29T11:43:41Z

+      #!/bin/bash
+      set -euo pipefail
+      oc wait --for=condition=Established \
+        crd/openstackcontrolplanes.core.openstack.org \
+        crd/keystoneapplicationcredentials.keystone.openstack.org \
+        --timeout=60s
+      oc get namespace openstack
+      oc wait --for=condition=Ready pod/mock-keystone-server -n openstack --timeout=60s
+      oc get openstackcontrolplane openstack-galera-network-isolation -n openstack


question (blocking): Why this can not be a standard Kuttl assert using YAML manifests?

lpiwowar · 2026-06-29T11:44:41Z

+      #!/bin/bash
+      set -euo pipefail
+
+      echo "Waiting for password secret to be created..."
+      for i in $(seq 1 60); do
+        if oc get secret lightspeed-password -n openstack 2>/dev/null; then
+          echo "Password secret found"
+          break
+        fi
+        if [ "$i" -eq 60 ]; then
+          echo "ERROR: Password secret not created after 120s"
+          exit 1
+        fi
+        sleep 2
+      done
+
+      PW=$(oc get secret lightspeed-password -n openstack \
+        -o jsonpath='{.data.password}' | base64 -d)
+      if [ -z "$PW" ]; then
+        echo "ERROR: Password secret has no password key"
+        exit 1
+      fi
+      echo "Password secret has password key (length: ${#PW})"
+
+      echo "Waiting for AC CR to be created..."
+      for i in $(seq 1 60); do
+        if oc get keystoneapplicationcredentials.keystone.openstack.org lightspeed \
+             -n openstack 2>/dev/null; then
+          echo "AC CR found"
+          break
+        fi
+        if [ "$i" -eq 60 ]; then
+          echo "ERROR: AC CR not created after 120s"
+          exit 1
+        fi
+        sleep 2
+      done
+
+      # Verify AC CR spec
+      USERNAME=$(oc get keystoneapplicationcredentials.keystone.openstack.org lightspeed \
+        -n openstack -o jsonpath='{.spec.userName}')
+      if [ "$USERNAME" != "lightspeed" ]; then
+        echo "ERROR: AC CR userName is '$USERNAME', expected 'lightspeed'"
+        exit 1
+      fi
+
+      SECRET_REF=$(oc get keystoneapplicationcredentials.keystone.openstack.org lightspeed \
+        -n openstack -o jsonpath='{.spec.secret}')
+      if [ "$SECRET_REF" != "lightspeed-password" ]; then
+        echo "ERROR: AC CR secret is '$SECRET_REF', expected 'lightspeed-password'"
+        exit 1
+      fi
+
+      EDPM_SVC=$(oc get keystoneapplicationcredentials.keystone.openstack.org lightspeed \
+        -n openstack -o jsonpath='{.metadata.annotations.keystone\.openstack\.org/edpm-service}')
+      if [ "$EDPM_SVC" != "false" ]; then
+        echo "ERROR: AC CR edpm-service annotation is '$EDPM_SVC', expected 'false'"
+        exit 1
+      fi
+


question (blocking): Why this can not be a standard Kuttl assert using YAML manifests? To me it seems like there is nothing preventing it.

lpiwowar · 2026-06-29T11:47:56Z

+      oc delete openstackcontrolplane openstack-galera-network-isolation -n openstack --ignore-not-found=true
+      oc delete pod mock-keystone-server -n openstack --ignore-not-found=true
+      oc delete svc mock-keystone-server -n openstack --ignore-not-found=true
+      oc delete configmap mock-keystone-code openstack-config-mock -n openstack --ignore-not-found=true
+      oc delete secret openstack-config-secret-mock combined-ca-bundle-mock ac-lightspeed-test-secret -n openstack --ignore-not-found=true
+      oc delete namespace openstack --ignore-not-found=true --timeout=60s
+      oc delete crd openstackcontrolplanes.core.openstack.org keystoneapplicationcredentials.keystone.openstack.org --ignore-not-found=true


question (blocking): Why can't we use?

--- apiVersion: kuttl.dev/v1beta1 kind: TestStep delete:

lpiwowar · 2026-06-29T11:49:06Z

+      set -euo pipefail
+      ! oc get pod mock-keystone-server -n openstack 2>/dev/null
+      ! oc get svc mock-keystone-server -n openstack 2>/dev/null
+      ! oc get configmap mock-keystone-code -n openstack 2>/dev/null
+      ! oc get configmap openstack-config-mock -n openstack 2>/dev/null
+      ! oc get secret openstack-config-secret-mock -n openstack 2>/dev/null
+      ! oc get secret combined-ca-bundle-mock -n openstack 2>/dev/null
+      ! oc get namespace openstack 2>/dev/null
+      ! oc get crd openstackcontrolplanes.core.openstack.org 2>/dev/null
+      ! oc get crd keystoneapplicationcredentials.keystone.openstack.org 2>/dev/null


question (blocking): Why can't we use the standard -errors.yaml file to verify resources are missing in a cluster?

lpiwowar · 2026-06-29T12:09:36Z

+	helper *common_helper.Helper,
+	namespace string,
+) error {
+	rawClient, err := getRawClient(helper)


question: Can't we go without the getRawClient here (+ other places where it got introduced as part of this PR)? I do not like how the getRawClient is slowly spreading across the operator's code. I think if the customer sets the OperatorGroup correctly then WATCH_NAMESPACE will be set to openstack-operators / openstack and there is no reason to use the raw client.

Update operator-sdk to v1.41.1

b6ccce0

Align with the version used by the openstack-k8s-operators.

openshift-ci Bot requested review from lpiwowar and umago June 24, 2026 14:43

openshift-ci Bot added the approved label Jun 24, 2026

coderabbitai Bot reviewed Jun 24, 2026

View reviewed changes

Akrog and others added 3 commits June 25, 2026 10:58

Akrog force-pushed the lcore-mcps branch from f0918cb to 2fde292 Compare June 25, 2026 08:58

Akrog mentioned this pull request Jun 25, 2026

WIP: Allow changing all the container images #126

Open

umago requested changes Jun 26, 2026

View reviewed changes

lpiwowar requested changes Jun 29, 2026

View reviewed changes

		// MCPServerContainerImage is the fall-back container image for the MCP server
		MCPServerContainerImage = "quay.io/openstack-lightspeed/rhos-mcps:latest"

Uh oh!

Conversation

Akrog commented Jun 24, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

openshift-ci Bot commented Jun 24, 2026

Uh oh!

coderabbitai Bot commented Jun 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review skipped

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 24, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 24, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 26, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 26, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 24, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 24, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 24, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 24, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 24, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 24, 2026

Choose a reason for hiding this comment

Uh oh!

umago left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

lpiwowar left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Akrog commented Jun 24, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 24, 2026 •

edited

Loading

coderabbitai Bot Jun 24, 2026 •

edited

Loading

coderabbitai Bot Jun 24, 2026 •

edited

Loading

lpiwowar Jun 29, 2026 •

edited

Loading